Defining functional distance using manifold embeddings of gene ontology annotations.
Abstract
Although rigorous measures of sequence and structural similarity are now well established, defining functional relationships has remained particularly daunting. Here, we present several manifold embedding techniques that compute distances between Gene Ontology (GO) functional annotations and thereby estimate functional distances between protein domains. To evaluate accuracy, we correlate functional distance with well-established measures of sequence, structural, and phylogenetic similarity. We then show that the manual classification of structures into folds and superfamilies is mirrored by proximity in the newly defined function space, and that functional distances place structure-function relationships in biological context, yielding insight into divergent and convergent evolution. The methods and results in this paper can be readily generalized and applied to a wide array of biologically relevant investigations, such as the accuracy of annotation transfer, the relationship between sequence, structure, and function, or the coherence of expression modules.
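As a rough illustration of the idea in the abstract, the sketch below embeds a toy set of annotation terms in a low-dimensional space and then measures distances between domains in that space. It is not the authors' method: the abstract does not say which embeddings were used, so classical multidimensional scaling stands in here as one simple example of an embedding technique. The term names, the path-length distance matrix, the domain annotations, and the choice to place each domain at the mean of its terms' coordinates are all hypothetical assumptions made for illustration.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points so Euclidean distances approximate `dist` (classical MDS)."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives caused by round-off.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical terms and hypothetical path lengths between them
# (a simple chain term_a - term_b - term_c - term_d).
terms = ["term_a", "term_b", "term_c", "term_d"]
graph_dist = np.array([
    [0, 1, 2, 3],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 1, 0],
], dtype=float)

coords = classical_mds(graph_dist, n_components=2)

# Hypothetical domain-to-term annotations.
annotations = {
    "domain_1": ["term_a", "term_b"],
    "domain_2": ["term_c", "term_d"],
    "domain_3": ["term_a"],
}


def domain_position(name):
    """Assumption: a domain sits at the mean of its terms' embedded coordinates."""
    idx = [terms.index(t) for t in annotations[name]]
    return coords[idx].mean(axis=0)


def functional_distance(a, b):
    """Euclidean distance between two domains in the embedded space."""
    return float(np.linalg.norm(domain_position(a) - domain_position(b)))
```

In this toy setup, `functional_distance("domain_1", "domain_3")` comes out smaller than `functional_distance("domain_1", "domain_2")`, because the first pair shares an annotation term. Other embeddings or other ways of combining a domain's terms could be swapped in without changing the overall shape of the calculation.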
Related articles
Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study
Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We...
Inferring gene annotations in Gene Ontology from gene expression data
Motivation: The Gene Ontology (GO) project develops a standard way to describe gene products in terms of their associated biological processes, cellular components and molecular functions. However, it is far from complete. Due to lacking biological knowledge or other technical difficulties, many gene products do not have GO annotations, and the annotations for many other gene products are not s...
Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt
The Gene Ontology Consortium (GOC) is a major bioinformatics project that provides structured controlled vocabularies to classify gene product function and location. GOC members create annotations to gene products using the Gene Ontology (GO) vocabularies, thus providing an extensive, publicly available resource. The GO and its annotations to gene products are now an integral part of functional...
Estimating the Quality of Ontology-Based Annotations by Considering Evolutionary Changes
Ontology-based annotations associate objects, such as genes and proteins, with well-defined ontology concepts to semantically and uniformly describe object properties. Such annotation mappings are utilized in different applications and analysis studies whose results strongly depend on the quality of the used annotations. To study the quality of annotations we propose a generic evaluation approa...
Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products
Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may sig...
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume 104, issue 27
Publication date: 2007